Modeling the probability distribution of positional errors incurred by residential address geocoding
Authors
Abstract
BACKGROUND: The assignment of a point-level geocode to subjects' residences is an important data assimilation component of many geographic public health studies. Often, these assignments are made by automated geocoding, which attempts to match each subject's address to an address-ranged street segment georeferenced within a streetline database and then interpolates the position of the address along that segment. Unfortunately, this process results in positional errors. Our study sought to model the probability distribution of positional errors associated with automated geocoding and E911 geocoding. RESULTS: Positional errors were determined for 1423 rural addresses in Carroll County, Iowa, as the vector difference between each 100%-matched automated geocode and its true location as determined by orthophoto and parcel information. Errors were also determined for 1449 60%-matched geocodes and 2354 E911 geocodes. Huge (> 15 km) outliers occurred among the 60%-matched geocoding errors; outliers also occurred for the other two types of geocoding errors but were much smaller. E911 geocoding was more accurate (median error length = 44 m) than 100%-matched automated geocoding (median error length = 168 m). The empirical distributions of positional errors associated with 100%-matched automated geocoding and E911 geocoding exhibited a distinctive Greek-cross shape and had many other features that could not be fitted adequately by a single bivariate normal or t distribution. However, mixtures of t distributions with two or three components fit the errors very well. CONCLUSION: Mixtures of bivariate t distributions with few components appear to be flexible enough to fit many positional error datasets associated with geocoding, yet parsimonious enough to be feasible for nascent applications of measurement-error methodology to spatial epidemiology.
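The two quantities the abstract relies on, the positional error as a vector difference and the density of a mixture of bivariate t distributions, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the coordinates, and the mixture parameters below are hypothetical and chosen only for illustration, and the sketch evaluates a given mixture density rather than fitting one to data. It uses the standard bivariate t density, which in two dimensions simplifies to (1 + δ/ν)^(-(ν+2)/2) / (2π·sqrt(det Σ)), where δ is the squared Mahalanobis distance.

```python
import math
from statistics import median


def error_vectors(geocoded, truth):
    """Positional error of each address: geocoded position minus true position."""
    return [(gx - tx, gy - ty) for (gx, gy), (tx, ty) in zip(geocoded, truth)]


def error_lengths(errors):
    """Euclidean length of each error vector (same units as the coordinates)."""
    return [math.hypot(ex, ey) for ex, ey in errors]


def bivariate_t_pdf(x, y, mu, sigma, nu):
    """Density of a bivariate t distribution with location mu, 2x2 scale
    matrix sigma (symmetric, positive definite) and nu degrees of freedom."""
    (a, b), (c, d) = sigma
    det = a * d - b * c
    dx, dy = x - mu[0], y - mu[1]
    # Squared Mahalanobis distance, using the closed-form 2x2 inverse.
    delta = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return (1.0 + delta / nu) ** (-(nu + 2.0) / 2.0) / (2.0 * math.pi * math.sqrt(det))


def mixture_pdf(x, y, components):
    """Density of a finite mixture; each component is (weight, mu, sigma, nu)
    and the weights are assumed to sum to 1."""
    return sum(w * bivariate_t_pdf(x, y, mu, sigma, nu) for w, mu, sigma, nu in components)


# Hypothetical projected coordinates in metres (made-up values).
geocoded = [(103.0, 204.0), (50.0, 80.0), (10.0, 10.0)]
truth = [(100.0, 200.0), (50.0, 20.0), (4.0, 2.0)]
lengths = error_lengths(error_vectors(geocoded, truth))
median_error = median(lengths)

# Hypothetical two-component mixture: one component stretched along the
# x axis and one along the y axis, loosely evoking a cross-shaped error cloud.
components = [
    (0.5, (0.0, 0.0), ((400.0, 0.0), (0.0, 25.0)), 3.0),
    (0.5, (0.0, 0.0), ((25.0, 0.0), (0.0, 400.0)), 3.0),
]
density_at_origin = mixture_pdf(0.0, 0.0, components)
```

Using t components rather than normal ones gives heavier tails, which is one plausible reason such mixtures can accommodate the outliers mentioned in the abstract. Estimating the weights, scale matrices, and degrees of freedom from data would require a fitting procedure that this sketch does not attempt.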
Similar resources
Leading the charge for better batteries.
Accuracy of two geocoding methods for geographic information system-based exposure assessment in epidemiological studies
BACKGROUND: Environmental exposure assessment based on Geographic Information Systems (GIS) and study participants' residential proximity to environmental exposure sources relies on the positional accuracy of subjects' residences to avoid misclassification bias. Our study compared the positional accuracy of two automatic geocoding methods to a manual reference method. METHODS: We geocoded 4,247...
A research agenda: does geocoding positional error matter in health GIS studies?
Until recently, little attention has been paid to geocoding positional accuracy and its impacts on accessibility measures; estimates of disease rates; findings of disease clustering; spatial prediction and modeling of health outcomes; and estimates of individual exposures based on geographic proximity to pollutant and pathogen sources. It is now clear that positional errors can result in flawed...
Accuracy of residential geocoding in the Agricultural Health Study
BACKGROUND: Environmental exposure assessments often require a study participant's residential location, but the positional accuracy of geocoding varies by method and the rural status of an address. We evaluated geocoding error in the Agricultural Health Study (AHS), a cohort of pesticide applicators and their spouses in Iowa and North Carolina, U.S.A. METHODS: For 5,064 AHS addresses in Iowa, ...
Estimating Spatial Intensity and Variation in Risk from Locations Subject to Geocoding Errors
The accurate assignment of geocodes to the residences of subjects in a study population is an important component of the data acquisition/assimilation stage of a spatial epidemiological investigation. Unfortunately, however, it is not a simple matter to obtain accurate point-level geocodes. Recent investigations have demonstrated that when residential address geocoding is performed by the most ...
Journal title:
- International Journal of Health Geographics
Volume: 6
Publication date: 2007